42 research outputs found

    An algorithm for accurate taillight detection at night

    Vehicle detection is an important process in many advanced driver assistance systems (ADAS), such as forward collision avoidance, time-to-collision (TTC) estimation and intelligent headlight control (IHC). This paper presents a new algorithm that detects the vehicle ahead by locating its taillight pair. First, the proposed method extracts taillight candidate regions by filtering taillight-coloured regions and applying morphological operations. Second, candidate pairing and pair-symmetry analysis are performed to determine the taillight positions. The aim of this work is to improve the accuracy of taillight detection at night, where streetlamps and other elements of complex scenes produce many bright-spot candidates. Experiments on a still-image dataset show that the proposed algorithm improves the taillight detection accuracy rate and remains robust on low-light images
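
    The pairing and symmetry-analysis step can be sketched as follows. This is a minimal illustration assuming colour filtering and morphology have already produced candidate bounding boxes (x, y, w, h); the function name, thresholds and box values are illustrative assumptions, not the paper's actual parameters.

```python
def pair_taillights(candidates, y_tol=10, size_tol=0.3, min_gap=40):
    """Return (left, right) box pairs that are vertically aligned, of
    similar size and horizontally separated - a rough symmetry test."""
    pairs = []
    for i, a in enumerate(candidates):
        for b in candidates[i + 1:]:
            left, right = (a, b) if a[0] < b[0] else (b, a)
            aligned = abs(left[1] - right[1]) <= y_tol            # same height in image
            similar = abs(left[2] - right[2]) <= size_tol * max(left[2], right[2])
            apart = (right[0] - (left[0] + left[2])) >= min_gap   # horizontal gap
            if aligned and similar and apart:
                pairs.append((left, right))
    return pairs

# Example: two aligned lamps plus a bright spot high in the frame (a streetlamp).
boxes = [(100, 200, 30, 20), (300, 202, 28, 19), (180, 40, 25, 25)]
print(pair_taillights(boxes))  # the streetlamp is rejected by the alignment test
```

    The alignment test is what discards streetlamps and other isolated bright spots, since they rarely have a symmetric partner at the same image height.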

    License plate localization based on statistical measures of license plate features

    License plate localization is considered the most important part of a license plate recognition system: the overall recognition accuracy depends on the ability to detect the plate. This paper presents a novel method for license plate localization based on license plate features. The proposed method consists of two main processes. First, in the candidate region extraction step, the Sobel operator is applied to obtain vertical edges, and potential candidate regions are then extracted using mathematical morphology operations [5]. Second, the license plate verification step employs the standard deviation of license plate features to confirm the plate position. The experimental results show that the proposed method achieves high-quality license plate localization with a high accuracy rate of 98.26%
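
    The two stages can be sketched in a few lines. This is a hedged, NumPy-only illustration: the Sobel kernel is standard, but the verification threshold and the toy patches are assumptions, not the paper's actual parameters.

```python
import numpy as np

# Horizontal Sobel kernel: responds to vertical edges, which plate text is rich in.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

def vertical_edges(gray):
    """Naive 2-D convolution of a grayscale image with the Sobel kernel."""
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(gray[i:i + 3, j:j + 3] * SOBEL_X)
    return np.abs(out)

def looks_like_plate(region, min_std=30.0):
    """Verification step: plate text produces a high intensity spread,
    so require the region's standard deviation to exceed a threshold."""
    return float(np.std(region)) >= min_std

# A plate-like patch (dark/bright stripes) versus a flat background patch.
plate = np.tile([0.0, 0.0, 255.0, 255.0], (8, 4))
flat = np.full((8, 16), 128.0)
print(looks_like_plate(plate), looks_like_plate(flat))  # True False
```

    In a full pipeline the morphology step would merge the strong vertical-edge responses into candidate rectangles before the standard-deviation check is applied.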

    Synthetic data generation in finance: requirements, challenges and applicability

    Financial datasets contain sensitive, private and identifiable details about clients. The usage and distribution of such data for research outside a financial institution are strictly constrained by privacy laws. One option for dealing with this restriction is to create artificial data: generating synthetic data protects the confidentiality of customers' information, and data privacy is a prime public concern. This research study reviews the requirements and challenges of data generation techniques and of handling synthetic data in finance

    TheChain: a fast, secure and parallel treatment of transactions

    The smart distributed ledger (aka blockchain) has attracted much attention in recent years. According to the European Parliament, this technology has the potential to change the lives of many people. A blockchain is a data structure built upon a hash function in a distributed network, enabled by an incentive mechanism that discourages malicious nodes from participating. Consensus is at the core of blockchain technology and is driven by information embedded in a data structure that takes many forms, such as linear, tree and graph chains. The relevant information is subject to various validation incentives among the miners, such as proof of stake and proof of work. However, all existing solutions suffer from a heavy state transition before even reaching the validation mechanism, which itself suffers from resource consumption, monopoly or attacks. This work raises the following question: "Why is there a need for consensus when all participants can make a quick and correct decision?" It underlines the fact that the ledger is sometimes maintained by regional parties, which partitions the data into partial territories and eliminates the monopoly that is the hurdle to removing the trusted party. The validity of a blockchain transaction comes from the related information scattered over the data structure, and its authenticity lies in the digital signature. The aim is to switch from a validator driven by incentives to a broadcaster governed by an unsupervised clustering algorithm, with integrity residing in the intersection between regions. The data structure takes advantage of the Petri net model because of its suitability: building the entire ledger as a Petri net allows parallel processing of transactions and secures the total order between participants at the memory-reference layer. Moreover, it checks the validation criteria quickly and safely before adding a new transaction list, using graph reachability
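
    The final validation step described above relies on graph reachability. A minimal sketch, assuming a toy directed transaction graph rather than TheChain's actual Petri-net model (the ledger edges and transaction names are illustrative):

```python
from collections import deque

def reachable(edges, src, dst):
    """Breadth-first reachability check in a directed transaction graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# tx3 spends an output of tx1: accept it only if tx1 reaches tx3 in the ledger.
ledger = [("tx1", "tx2"), ("tx2", "tx3")]
print(reachable(ledger, "tx1", "tx3"), reachable(ledger, "tx3", "tx1"))  # True False
```

    In a Petri-net formulation the same question becomes a marking-reachability check, which is what allows the validation to run in parallel across regions.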

    Comparative analysis using supervised learning methods in anti-money laundering of Bitcoin data

    With the advance of Bitcoin technology, money laundering has been incentivised: the Bitcoin blockchain acts as a den in which a user's identity is hidden behind a pseudonym known as an address. Although this trait permits hiding in plain sight, the public ledger of the Bitcoin blockchain gives investigators more power and enables collective intelligence for anti-money laundering and forensic analysis. This fascinating paradox arises from the strength of Bitcoin technology. Machine learning techniques have attained promising results in forensic analysis for spotting suspicious behaviour on the Bitcoin blockchain. This paper presents a comparative analysis of the performance of classical supervised learning methods in predicting licit and illicit transactions in the network, using the recently published Elliptic data set derived from the Bitcoin blockchain: a time series of the Bitcoin transaction graph with node transactions and directed payment-flow edges. In addition, an ensemble learning method combining the given supervised learning models is utilised, and it outperforms the classical methods used in the original paper. Using the same data set, we show that we can predict licit/illicit transactions with an accuracy of 98.13% and an F1 score of 83.36% using the proposed method. We discuss the variety of supervised learning methods and their capabilities in assisting forensic analysis, and propose future work directions
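
    The ensemble idea can be sketched with scikit-learn's voting combiner. This is a hedged illustration, not the paper's configuration: synthetic imbalanced data stands in for the Elliptic data set, and the two base learners are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data: ~10% "illicit" (class 1), as in AML settings.
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=0))],
    voting="soft")  # average the base models' predicted probabilities
ensemble.fit(X_tr, y_tr)
pred = ensemble.predict(X_te)
print(round(accuracy_score(y_te, pred), 3), round(f1_score(y_te, pred), 3))
```

    With heavy imbalance, the F1 score on the illicit class is the more informative metric, which is why the paper reports it alongside accuracy.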

    Effect of data resampling on feature importance in imbalanced blockchain data: Comparison studies of resampling techniques

    Cryptocurrency blockchain data suffer from a class-imbalance problem because only a few labels of illicit or fraudulent activity in the blockchain network are known. We therefore provide a comparison of various resampling methods applied to two highly imbalanced datasets derived from the Bitcoin and Ethereum blockchains after further dimensionality reduction, unlike previous studies of these datasets. First, we study the performance of various classical supervised learning methods in classifying illicit transactions/accounts on the Bitcoin/Ethereum datasets, respectively. Next, we apply a variety of resampling techniques to these datasets using the best-performing learning algorithm on each. Finally, we study the feature importance of the resulting models, where the resampled datasets reveal a direct influence on the explainability of the model. Our main finding is that undersampling with the edited nearest-neighbour technique attains an accuracy of more than 99% on the given datasets by removing noisy data points from the whole dataset. Moreover, the best-performing learning algorithms show superior performance after feature reduction on these datasets compared with their original studies. A distinctive contribution is the discussion of the effect of data resampling on feature importance, which is interconnected with explainable artificial intelligence techniques
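
    The edited nearest-neighbour (ENN) rule mentioned above removes any point whose nearest neighbours mostly disagree with its label. A minimal NumPy sketch of the idea (production code would use a library such as imbalanced-learn; the toy clusters below are illustrative):

```python
import numpy as np

def enn_undersample(X, y, k=3):
    """Keep a point only if at least half of its k nearest
    neighbours carry the same label (edited nearest-neighbour rule)."""
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                       # exclude the point itself
        nn = np.argsort(d)[:k]              # indices of the k nearest points
        if np.sum(y[nn] == y[i]) >= k / 2:  # neighbours agree -> keep
            keep.append(i)
    keep = np.array(keep)
    return X[keep], y[keep]

# Two clean clusters plus one mislabelled (noisy) point inside cluster 0.
X = np.array([[0.0, 0], [0.1, 0], [0.2, 0],
              [5.0, 5], [5.1, 5], [5.2, 5], [0.1, 0.1]])
y = np.array([0, 0, 0, 1, 1, 1, 1])         # the last label is noise
Xr, yr = enn_undersample(X, y)
print(len(Xr))  # 6: the noisy point is removed
```

    Removing such noisy points is exactly what allows the downstream classifier to exceed 99% accuracy in the study, at the cost of shrinking the dataset.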

    Improving Cancer Detection Classification Performance Using GANs in Breast Cancer Data

    Breast cancer is one of the most prevalent cancers in women, and many studies have been conducted in the breast cancer domain in recent years. Previous studies have confirmed that timely and accurate breast cancer detection allows patients to undergo early treatment. Recently, generative adversarial networks (GANs) have been applied in the medical domain to synthetically generate image and non-image data for diagnosis. However, developing an effective classification model in healthcare is difficult owing to limited datasets. To address this challenge, we propose a novel K-CGAN method, trained in different settings, to generate synthetic data. This study applied five classification methods and feature selection to non-image Wisconsin Breast Cancer data of 357 benign and 212 malignant cases for evaluation. Moreover, we used recall, precision, accuracy and F1 score on the synthetic data generated by the K-CGAN model to verify its classification performance. The empirical study shows that K-CGAN performed well, with the highest stability compared to the other GAN variants. Hence, our findings indicate that the synthetic data generated by K-CGAN accurately represent the original data
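
    The evaluation protocol described above (verify synthetic data by the classification metrics it supports) can be sketched as train-on-synthetic, test-on-real. Here a simple per-class Gaussian sampler stands in for K-CGAN, purely as an illustrative assumption:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_real_tr, X_real_te, y_real_tr, y_real_te = train_test_split(
    X, y, stratify=y, random_state=0)

# Stand-in generator: sample synthetic points around each class's mean.
rng = np.random.default_rng(0)
X_syn, y_syn = [], []
for cls in (0, 1):
    Xc = X_real_tr[y_real_tr == cls]
    X_syn.append(rng.normal(Xc.mean(axis=0), Xc.std(axis=0), size=(300, X.shape[1])))
    y_syn.append(np.full(300, cls))
X_syn, y_syn = np.vstack(X_syn), np.concatenate(y_syn)

clf = LogisticRegression(max_iter=1000).fit(X_syn, y_syn)  # train on synthetic
pred = clf.predict(X_real_te)                              # score on real data
print({m.__name__: round(m(y_real_te, pred), 3)
       for m in (accuracy_score, precision_score, recall_score, f1_score)})
```

    If the synthetic data faithfully represent the original distribution, these real-data scores approach those of a classifier trained on the real training split.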

    Generating synthetic data for credit card fraud detection using GANs

    Deep learning-based classifiers for object classification and recognition have been utilized in various sectors. However, research has shown that deep neural networks achieve better performance on balanced datasets than on imbalanced ones, and datasets in production environments are often imbalanced because fraud cases are rare. Deep generative approaches such as GANs have been applied as an efficient method for augmenting high-dimensional data. In this research study, classifiers based on Random Forest, Nearest Neighbour, Logistic Regression, MLP and AdaBoost were trained using our novel K-CGAN approach and compared with other oversampling approaches, achieving a higher F1 score. Experiments demonstrate that the classifiers trained on the augmented set performed far better than the same classifiers trained on the original data, producing an effective fraud detection mechanism. Furthermore, this research illustrates the problem of data imbalance and introduces a novel model that is able to generate high-quality synthetic data
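
    The augmentation step can be sketched as follows: generate extra minority-class ("fraud") rows until the training set is balanced. A simple jitter-based sampler stands in for K-CGAN here; the function name, noise scale and toy data are illustrative assumptions.

```python
import numpy as np

def augment_minority(X, y, minority=1, rng=None):
    """Return a balanced training set by appending noisy copies of
    minority-class rows until both classes have equal counts."""
    rng = rng or np.random.default_rng(0)
    Xm = X[y == minority]
    deficit = np.sum(y != minority) - len(Xm)   # rows needed to balance
    idx = rng.integers(0, len(Xm), size=deficit)
    fakes = Xm[idx] + rng.normal(scale=0.05, size=(deficit, X.shape[1]))
    return (np.vstack([X, fakes]),
            np.concatenate([y, np.full(deficit, minority)]))

X = np.vstack([np.zeros((95, 3)), np.ones((5, 3))])  # 95 licit vs 5 fraud rows
y = np.array([0] * 95 + [1] * 5)
Xb, yb = augment_minority(X, y)
print(np.bincount(yb))  # [95 95]
```

    A trained GAN replaces the jitter sampler: instead of perturbing real fraud rows, it draws new rows from a learned model of the minority-class distribution.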

    Effective Feature Engineering and Classification of Breast Cancer Diagnosis: A Comparative Study

    Breast cancer is among the most common cancers found in women and a leading cause of cancer-related deaths, making it a severe public health issue. Early prediction of breast cancer can increase the chances of survival and promote early medical treatment. Moreover, the accurate classification of benign cases can prevent patients from undergoing unnecessary treatment. Therefore, the accurate and early diagnosis of breast cancer and its classification into benign or malignant classes are much-needed research topics. This paper presents an effective feature engineering method to extract and modify features from the Wisconsin Breast Cancer Diagnosis dataset, and studies its effect on different classifiers. We then use the engineered features to compare six popular machine-learning models for classification: Logistic Regression, Random Forest, Decision Tree, K-Neighbors, Multi-Layer Perceptron (MLP) and XGBoost. The results show that the Decision Tree model, when applied to the proposed feature engineering, performed best, achieving an average accuracy of 98.64%
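
    The comparison can be sketched for one of the six models. This is a baseline illustration on the raw Wisconsin features: the paper's engineered features are not reproduced, and the split and hyper-parameters are assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Raw Wisconsin Breast Cancer Diagnosis data bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, tree.predict(X_te))
print(round(acc, 3))
```

    A baseline like this typically lands in the low-to-mid 90% range; the gap to the reported 98.64% is what the proposed feature engineering is meant to close.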

    Uncertainty estimation-based adversarial attacks: a viable approach for graph neural networks

    Uncertainty estimation has received considerable attention in applied machine learning as a way to capture model uncertainty. For instance, the Monte Carlo dropout method (MC-dropout), an approximate Bayesian approach, has attracted intensive attention for producing model uncertainty owing to its simplicity and efficiency. However, MC-dropout has revealed shortcomings in capturing erroneous predictions that lie in overlapping classes. Such predictions stem from noisy data points that can neither be reduced by more training data nor detected through model uncertainty. On the other hand, Monte Carlo based on adversarial attacks (MC-AA) perturbs the inputs using the adversarial-attack idea to capture model uncertainty, and it mitigates the shortcomings of the previous methods by capturing wrongly labelled points in overlapping regions. Motivated by this method, which had previously been validated only with neural networks, we apply MC-AA to various graph neural network models to obtain uncertainties on two public real-world graph datasets, Elliptic and GitHub. First, we perform binary node classification; then we apply MC-AA and other recent uncertainty estimation methods to capture the models' uncertainty. Uncertainty evaluation metrics are computed to evaluate and compare the performance of model uncertainty. We highlight the efficacy of MC-AA in capturing uncertainties in graph neural networks, where MC-AA outperforms the other given methods
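
    The MC-dropout baseline discussed above can be sketched in a few lines: run the same input through the network many times with random dropout masks and read the spread of the outputs as model uncertainty. The tiny random-weight network below is an illustrative assumption, not one of the paper's GNNs.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 16))   # fixed "trained" weights of a toy 2-layer MLP
W2 = rng.normal(size=(16, 1))

def mc_dropout_predict(x, n_samples=200, p_drop=0.5):
    """Repeat a stochastic forward pass; the std of the sigmoid
    outputs serves as the MC-dropout uncertainty estimate."""
    outs = []
    for _ in range(n_samples):
        h = np.maximum(x @ W1, 0.0)           # ReLU hidden layer
        mask = rng.random(h.shape) > p_drop   # dropout kept ON at inference
        h = h * mask / (1.0 - p_drop)
        z = (h @ W2).item()
        outs.append(1.0 / (1.0 + np.exp(-z))) # sigmoid output
    outs = np.array(outs)
    return outs.mean(), outs.std()

mean, std = mc_dropout_predict(np.ones(4))
print(round(mean, 3), std > 0)  # non-zero spread = non-zero model uncertainty
```

    MC-AA replaces the random dropout masks with small adversarial perturbations of the input itself, which is what lets it expose uncertainty for points sitting between overlapping classes.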